Automatic Text Categorization of Mathematical Word Problems
Abstract
This paper describes a novel application of text categorization to mathematical word problems, specifically Multiplicative Compare and Equal Group problems. The empirical results and analysis show that common text processing techniques such as stopword removal and stemming should be used selectively: for this task it is highly beneficial to keep stopwords and to skip stemming. Part-of-speech tagging should also be used to separate words in discriminative parts of speech from those in non-discriminative parts of speech, which not only fail to help but can even mislead the categorization decision for mathematical word problems. An SVM classifier with these selectively applied text processing techniques outperforms an SVM classifier with the default setting of text processing techniques (i.e., stopword removal and stemming). Furthermore, a probabilistic meta classifier is proposed that combines the weighted results of two SVM classifiers that use different word problem representations generated by different text preprocessing techniques. The empirical results show that the probabilistic meta classifier further improves categorization accuracy.
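The weighted combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual combination rule, so the code below assumes a simple weighted linear mixture of the class probabilities produced by two classifiers. The function names `combine_weighted` and `predict`, the labels "MC" and "EG", and all weights and probabilities are hypothetical values chosen for illustration only.

```python
from typing import Dict


def combine_weighted(p_a: Dict[str, float], p_b: Dict[str, float],
                     w_a: float, w_b: float) -> Dict[str, float]:
    """Mix two classifiers' class-probability estimates with fixed weights.

    Assumption: a weighted linear mixture; the paper's exact rule may differ.
    """
    if w_a < 0 or w_b < 0 or (w_a + w_b) == 0:
        raise ValueError("weights must be non-negative and not both zero")
    labels = set(p_a) | set(p_b)
    total = w_a + w_b
    # Weighted average per label; a label missing from one model counts as 0.
    mixed = {c: (w_a * p_a.get(c, 0.0) + w_b * p_b.get(c, 0.0)) / total
             for c in labels}
    norm = sum(mixed.values())
    if norm == 0:
        raise ValueError("both classifiers assigned zero probability everywhere")
    # Renormalize so the combined scores sum to 1.
    return {c: v / norm for c, v in mixed.items()}


def predict(mixed: Dict[str, float]) -> str:
    """Return the label with the highest combined probability."""
    return max(mixed, key=lambda c: mixed[c])


# Hypothetical outputs of two classifiers for one word problem.
# "MC" = Multiplicative Compare, "EG" = Equal Group (labels are illustrative).
probs_a = {"MC": 0.7, "EG": 0.3}   # e.g. representation without stemming
probs_b = {"MC": 0.4, "EG": 0.6}   # e.g. representation with POS filtering
combined = combine_weighted(probs_a, probs_b, w_a=0.6, w_b=0.4)
label = predict(combined)
```

In this sketch the weights would typically be set from each classifier's performance on held-out data, which is again an assumption rather than something the abstract states.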
Similar resources
The characteristics of mathematical word problems at the middle school and suggested strategies to facilitate their solution process
Abstract: This paper first reviews the literature on the characteristics of mathematical word problems and their solution process. The review revealed that among the root causes of students’ difficulties with mathematical word problems, two factors are salient, namely the text complexity and the unfamiliar context. To shed more light on these findings, a factorial experimental study w...
Harnessing the Expertise of 70,000 Human Editors: Knowledge-Based Feature Generation for Text Categorization
Most existing methods for text categorization employ induction algorithms that use the words appearing in the training documents as features. While they perform well in many categorization tasks, these methods are inherently limited when faced with more complicated tasks where external knowledge is essential. Recently, there have been efforts to augment these basic features with external knowle...
The Role of Word Sense Disambiguation in Automated Text Categorization
Automated Text Categorization has reached the levels of accuracy of human experts. Provided that enough training data is available, it is possible to learn accurate automatic classifiers by using Information Retrieval and Machine Learning Techniques. However, the performance of this approach is damaged by the problems derived from language variation (especially polysemy and synonymy). We investigate...
An automated Arabic text categorization based on the frequency ratio accumulation
Compared to other languages, there is still a limited body of research which has been conducted for the automated Arabic Text Categorization (TC) due to the complex and rich nature of the Arabic language. Most of such research includes supervised Machine Learning (ML) approaches such as Naïve Bayes (NB), K-Nearest Neighbour (KNN), Support Vector Machine and Decision Tree. Most of these techniqu...
The learning vector quantization algorithm applied to automatic text classification tasks
Automatic text classification is an important task for many natural language processing applications. This paper presents a neural approach to develop a text classifier based on the Learning Vector Quantization (LVQ) algorithm. The LVQ model is a classification method that uses a competitive supervised learning algorithm. The proposed method has been applied to two specific tasks: text categori...